SWEclat: a frequent itemset mining algorithm over streaming data using Spark Streaming
Similar resources
Frequent Itemset Mining over Stream Data: Overview
During the past decade, stream data mining has attracted widespread attention from experts and researchers all over the world, and a large number of interesting research results have been achieved. Among them, frequent itemset mining is one of the main research branches of stream data mining and holds a fundamental and significant position. In order to further advance and develop the researc...
Frequent Data Itemset Mining Using VS_Apriori Algorithms
The organization, management, and accessing of information in a better manner in various data warehouse applications have been active areas of research for more than two decades. The work presented in this paper is motivated by that work and seeks to reduce the complexity involved in data mining from data warehouses. A new algorithm named VS_Apriori is introduced as the ...
Streaming Queries over Streaming Data
Recent work on querying data streams has focused on systems where newly arriving data is processed and continuously streamed to the user in real-time. In many emerging applications, however, ad hoc queries and/or intermittent connectivity also require the processing of data that arrives prior to query submission or during a period of disconnection. For such applications, we have developed PSoup...
YAFIMA: Yet Another Frequent Itemset Mining Algorithm
Efficient discovery of frequent patterns from large databases is an active research area in data mining with broad applications in industry and deep implications in many areas of data mining. Although many efficient frequent-pattern mining techniques have been developed in the last decade, most of them assume relatively small databases, leaving extremely large but realistic datasets out of reac...
Streaming Twitter Data Analysis Using Spark for Effective Job Search
Near real time Big Data from social network sites like Twitter or Facebook has been an interesting source for analytics by researchers in recent years owing to various factors including its up-to-date-ness, availability and popularity, though there may be a compromise in genuineness or accuracy. Apache Spark, the trendy big data processing engine that offers faster solutions compared to Hadoop,...
Journal
Journal title: The Journal of Supercomputing
Year: 2020
ISSN: 0920-8542,1573-0484
DOI: 10.1007/s11227-020-03190-5